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Computer science and biology have enjoyed a long and 
fruitful relationship for decades. Biologists rely on computa- 
tional methods to analyze and integrate large data sets, while 
several computational methods were inspired by the high- 
level design principles of biological systems. Recently, these 
two directions have been converging. In this review, we argue 
that thinking computationally about biological processes may 
lead to more accurate models, which in turn can be used to 
improve the design of algorithms. We discuss the similar 
mechanisms and requirements shared by computational and 
biological processes and then present several recent studies 
that apply this joint analysis strategy to problems related to 
coordination, network analysis, and tracking and vision. We 
also discuss additional biological processes that can be 
studied in a similar manner and link them to potential 
computational problems. With the rapid accumulation of data 
detailing the inner workings of biological systems, we expect 
this direction of coupling biological and computational 
studies to greatly expand in the future. 
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Introduction 

Biologists have been increasingly relying on sophisticated 
computational methods, especially over the last two decades 
as molecular data have rapidly accumulated. Computational 
tools for searching large databases, including BLAST (Altschul 
et al, 1990), are now routinely used by experimentalists. 
Genome sequencing and assembly rely heavily on algorithms 
to speed up data accumulation and analysis (Gusfield, 1997; 
Trapnell and Salzberg, 2009; Schatz et al, 2010). Computa- 
tional methods have also been developed for integrating 
various types of functional genomics data and using them to 
create models of regulatory networks and other interactions in 
the cell (Alon, 2006; Huttenhower et al 2009; Myers et al, 
2009). Indeed, several computational biology departments 



have been established over the last few years that are focused 
on developing additional computational methods to aid in 
solving life science's greatest mysteries. 

Computer scientists have also relied on biological systems 
for inspiration, especially when developing optimization 
methods (Table I). Early work on artificial intelligence in the 
1960s leveraged ideas related to the activity of neurons in the 
brain to develop a class of computational methods known as 
neural networks (Hebb, 1949; Hopfield, 1982; Bishop, 1996), 
which have been used in many machine learning applications 
ranging from image segmentation to computing missile 
trajectories (Bishop, 1996). Other optimization techniques, 
such as genetic algorithms (Goldberg, 1989), were inspired by 
common operations in DNA sequence evolution and have 
been widely applied over the last 20 years. Social insects 
(Dorigo et al, 2006) and particle swarms (Kennedy and 
Eberhart, 2002) have also motivated the study of how self- 
organization emerges from local interactions (Abelson et al, 
2000; Committee on Frontiers at the Interface of Computing, 
Biology, and National Research Council, 2005). These ideas 
have been applied extensively to multi-agent system optimiza- 
tion (Deneubourg et al, 1990; Ferber, 1999). A number of 
additional methods, including non-negative matrix factoriza- 
tion (Lee and Seung, 1999) and population protocols (Aspnes 
and Ruppert, 2009; Chazelle, 2009) have also capitalized on 
biological insights to derive new computing paradigms. 

While all of these methods have led to successful applica- 
tions, they only relied on a high-level (and sometimes flawed) 
understanding of the biological processes they were based on, 
and thus they usually did not directly lead to new biological 
insights. Similarly, though novel computational methods have 
been developed to help researchers learn new biology, the 
application of these methods to the biological problem (i.e., 
the biological system itself) rarely fed back to help computer 
scientists design better algorithms. Thus, the two directions- 
relying on biological ideas to develop computational methods 
and using computational methods to study biology — remained 
largely separated (Figure 1). 



Shared principles of computational and 
biological systems 

There are many parallel requirements of computational and 
biological systems, which suggest that one can learn from the 
other (Figure 2) . First, like virtually all large-scale computing 
platforms, biological systems are mostly distributed consisting 
of molecules, cells, or organisms that interact, coordinate, and 
make decisions without central control (Seeley, 2002; 
Babaoglu et al, 2006; Figure 2A). Second, biological processes 
need to be able to successfully handle failures and attacks 
to thrive (Jeong et al, 2000; Kitano, 2004). Robustness is also 
a key property algorithm engineers covet when designing 
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computing systems that persist in noisy environments. Third, 
networks serve as an important medium through which 
interactions occur and information propagates (Alon, 2006; 
Figure 2B). In both settings, the structure of these networks is 



Table I Examples of biological systems that have inspired computational 
algorithms 



Biology 



Computer science References 



Neural networks Machine learning 

Sequence evolution Genetic algorithms 
Sequence evolution DNA computing 



Ant colonies, 
swarms 



Search optimization 



Social insects, flocks Network design 



Immune system 



Visual system 



Modeling and 
optimization 



Dimensionality 
reduction 



Hebb (1949),Hopfield 
(1982), Bishop (1996) 
Goldberg (1989) 
Adleman (1994), 
Benenson et al (2004), 
Istrail et al (2007), 
Paun et al (2004) 
Kennedy and Eberhart 
(2002), Committee on 
Frontiers at the Interface 
of Computing, Biology, 
and National Research 
Council (2005), 
Dorigo et al (2006) 
Deneubourg et al 
(1990), Beckers et al 
(1994), Theraulaz and 
Bonabeau (1995), Coore 
et al (1997), Ferber 

(1999) , Abelson et al 

(2000) , Fewell (2003), 
Werfel and Nagpal 
(2006) 

Forrest et al (1994), 
Kephart et al (1995), 
Garrett (2005), 
Greensmith et al (2005) 
Lee and Seung (1999) 



Most of these examples were based on a high-level understanding of the system 
and therefore usually did not directly lead to new biological insights. 



often directly linked to the system's function. Fourth, 
biological systems are often modular; that is, they reuse 
certain components in multiple, and sometimes very different, 
applications. This is a key design principle in many program- 
ming languages and in large complex networks (Kashtan and 
Alon, 2005; Fortunato, 2010; Figure 2C). Fifth, biological 
processes are often stochastic (Kaern et al, 2005) resembling 
randomized algorithms whose power has been well docu- 
mented for the design of approximation algorithms and whose 
use ranges from single process systems to large distributed 
systems (Motwani and Raghavan, 1996; Vazirani, 2004; 
Figure 2D). 

These shared principles indicate that thinking computation- 
ally about biological systems may lead to better understanding 
of their function. However, there are also differences between 
biological and computational systems. While computational 
systems are often focused on speed, biological systems can 
adapt dynamically to changing environments and pressures 
and have a keen ability to reach joint decisions with very 
limited knowledge of their surroundings. These aspects, while 
desirable, are still not fully realized by computational systems. 
By studying how biological systems operate within these 
constraints and conditions to solve problems, we may be able 
to infer new directions for addressing these issues within 
computational systems. 

With recent advances in our ability to generate and analyze 
biological data, new studies have emerged that have linked 
computational problems to biological problems, including 
cell-fate determination, slime mold tunneling, fault tolerance 
in regulatory networks, and bat navigation (Table II) . In this 
review, we discuss a number of these studies focusing on three 
areas of interest: coordination in large distributed systems, 
network processes and design, and tracking and vision 
(Figure 3). For each area, we highlight several studies that 
demonstrate how computational thinking (Wing, 2006) can 
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Figure 1 Traditional studies versus computational thinking. (A) Traditionally, biologists leveraged computing power to analyze and process data (e.g., hierarchically 
clustering gene expression microarrays to predict protein function), and computer scientists used high-level design principles of biological systems to motivate new 
computational algorithms (e.g., neural networks). Rarely these two directions were coupled and mutually beneficial. (B) By thinking computationally about how biological 
systems process information (Nurse, 2008; Hogeweg, 201 1), we can develop improved models and algorithms and provide a more coherent explanation of how and 
why the system operates as it does. 
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Figure 2 Parallels between computational and biological systems. (A) Biological and computational systems often coordinate and maintain functionality without relying 
on a central controller. (B) Networks are often the medium through which processes spread, either on the Web or in the cell. (C) Modularity is a common design principle 
in programming languages and also serves as the basis for increasing complexity and flexibility in evolving systems. (D) Stochastic processes in computational and 
biological settings help to remain robust in constrained, noisy environments. 



lead to new biological insights and new algorithms. We also 
discuss several additional biological processes for which better 
understanding of how they operate could lead to solutions to 
long-standing computer science problems. 

Note that from the computer science point of view, our focus 
in this review is on algorithms. Another important area which 
has both benefited from, and contributed to, the improved 
understanding of biological systems relates to engineering 
computational systems. We briefly discuss some examples for 
this direction including robotics and synthetic biology. 
However, this direction deserves its own review (s) and is thus 
out of the scope of this paper. 



Coordination 

Coordination is a major challenge for both computational and 
biological processes. At the molecular level, coordination is 
required to activate sets of genes that together respond to 
external conditions (Kaur et al> 2010) . At the cellular level, cells 
coordinate to determine cell fate during development (Afek 
et al, 2011) and to synchronize heart beats (Mirollo and 
Strogatz, 1990). Organisms, including the octopus, need to 
coordinate the activation of limbs (Sumbre et al, 2001), while 
shoals of fish coordinate to avoid predators (Ward et al, 2011). 
In computational systems, coordination is required for 
virtually all large-scale computing systems. Examples include 
search engines that coordinate thousands of back-end servers 
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to quickly respond to user queries (Brin and Page, 1998), 
sensor networks that aggregate data when monitoring 
environments (Main waring et al, 2002; Werner- Allen et al, 
2006), and mobile networks that synchronize data and 
schedules across multiple devices (Bernard et al> 2004) . Below 
we discuss a few examples in detail. 

The fly brain and maximal independent set 

A recent example of a coordination-focused study that jointly 
addressed a biological and computational problem is from 
work on fly brain development. During development of the fly 
brain, some cells in the pre-neural clusters become sensory 
organ precursor (SOP) cells. These cells later attach to small 
bristles on the fly's forehead that are eventually used to sense 
the environment. It has long been known that the distribution 
of SOP and non-SOP cells follow a strict spacing pattern 
(Rendel et al, 1965). In particular, cells that become SOPs 
inhibit their neighbors so that SOPs are surrounded by 
non-SOPs (Figure 3A, left). However, the exact process by 
which such global organization is achieved in a short time and 
by starting with all-equivalent cells remains unknown. 

While studying this problem, Afek et al (2011) made a 
new and interesting observation: the requirements from 
SOP selection — that each cell was either an SOP or directly 
connected to an SOP and that no two SOPs were connected- 
are similar to the formulation of the maximal independent set 
(MIS) problem in computer science. The MIS problem seeks to 
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Table II Examples of new synergistic relationship between biology and computer science 



Area 


Biological system 


Computational problem 


References 


Model 


Algorithm 


Coordination 


Fly SOP selection 


Maximal independent set 


Afek et al (2011) 


is* 






Fireflies flashing 


Synchronization 


Glass (2001), Lucarelli and Wang (2004), 
Hong and Scaglione (2005), Werner- 
Allen et al (2005) , Babaoglu et al (2007) , 
Pagliari and Scaglione (2011) 








* Octopus neural control Hierarchical computing 


Sumbre et al (2001) 


X 


X 




*Fish shoals/honeybees Ensemble methods 


Marshall et al (2009), Conradt (2011), 




X 








Ward et al (2011) 








* Quorum sensing 


Consensus 


(Marshall et al, 2009) 




X 




*DNA replication 


Resource allocation 


Farkash-Amar et al (2008) 


X 


X 




*Mate selection 


Graph matching 


— 


X 


X 


Networks 


Slime mold tunneling 


Network design 


Li et al (2010), Tero et al (2010), 
Watanabe et al (2011) 








Gene regulation 


Fault tolerance 


Gu et al (2003) , Balaji et al (2006) , Gitter 
et al (2009), Pomerance et al (2009) 


is 






* Protein localization 


Routing 


Shapiro et al (2009) 


X 


X 




* Brain networks 


Network design 


Chen et al (2006) 


X 


X 




*Cell signaling 


Clustering 


Wokoma et al (2005), Charalambous 
and Cui (2008) 


V* 




Tracking and 


Visual cortex 


Object recognition 


Riesenhuber and Poggio (1999), Serre 


\S 




Vision 




etal (2007a, b) 








Echolocation in bats 


Localization 


Ghose et al (2006), Yovel et al (2010) 








* Visual cortex 


One-shot machine learning 


Li et al (2006), Serre and Poggio (2010) 


X 


X 



The biological systems are divided into three areas: coordination, networks, and tracking and vision. For each system, we show the analogous computational problem 
and whether a model or algorithm was derived by previous works. Rows beginning with a star denote potential future work. 



elect a set of local leaders in a graph such that all other nodes in 
the graph are connected to a member of the MIS and no two 
MIS members are connected to each other. An MIS in an ad-hoc 
wireless network serves as a routing backbone by which nodes 
can communicate. It also corresponds to a distributed 
partitioning of the nodes into clusters that can be used to 
optimize network bandwidth and resource distribution in the 
network. Thus, this is an important practical problem that has 
been studied for more than two decades. However, the solution 
used by the fly seemed different from all previous algorithms 
suggested for this task (Luby, 1985; Alon et al, 1986). In 
particular, the fly's solution relied on very short messages and 
did not use any information regarding the number of neighbors 
that can still become SOPs. 

To study how the fly was still able to solve the SOP selection 
problem under such limitations, Afek et al (2011) used 
microscopy experiments to follow SOP selection in developing 
flies. Coupling the results of these experiments with possible 
molecular processes for the activation of the specific proteins 
known to be involved in the SOP selection process, they 
discovered that a stochastic feedback process, in which 
selection probability increases as a function of time, provides 
the best match to the experimental results. Based on these 
findings, they were able to develop a new computational 
algorithm for MIS selection that is more efficient and more 
robust than previous methods. Unlike previous MIS selection 
methods (Luby, 1985; Alon et al, 1986), the fly-based method 
does not use any knowledge about the number of neighbors a 
node has (its degree). Instead, with probability that increases 
exponentially over time, each node that has not already 
connected to an MIS node proposes itself as an MIS node. This 
leads to a selection pattern in which denser areas in the graph 
are assigned MIS nodes early on and less populated areas are 
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assigned at later stages (see Figure 3A). While the original 
algorithm of Afek et al (2011) required that nodes know an 
upper bound on the total number of nodes in the network, a 
new version of this algorithm (Afek et al, 2011) removes this 
requirement, leading to the first MIS selection algorithm that 
does not require any knowledge of the network or its topology. 
This fly-based method is specifically appropriate for sensor 
networks in which nodes might be randomly and dynamically 
distributed in an environment and have no knowledge of 
global topology. Thus, by studying a very specific cellular 
developmental system, the authors helped resolve a long- 
standing computer science problem and improved our under- 
standing of cell-fate determination. 

Additional coordination problems 

Several additional coordination problems have been studied, 
separately, in biology and computer science. The octopus 
presents a particularly interesting case with its eight arms. 
From what we know (Mather et al, 2010), the octopus employs 
a hierarchy of coordination elements. At the top level, the brain 
is in charge of what arm to use when, for example, seeking 
prey; however, at the lowest level each arm appears to act 
independently, adjusting its grip, size, and rotation without 
consulting the brain (Sumbre et al, 2001). It is believed that a 
large number of neurons in the octopus are outside the brain 
and in the arms which is highly unusual for vertebrates 
(Mather et al, 2010). Further, Mather et al (2010) claim that 
there is a third intermediate layer that allows coordination 
and communication between neighboring arms without 
going through the brain. Hierarchical distributed systems 
have also been studied as a means to improve load balancing, 
scheduling, and fault tolerance (Feitelson, 1990). Though no 
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Figure 3 Examples of biological systems and their analog computational problems. Bidirectional studies have led to insights into how biological systems operate, while 
at the same time advancing computational methods. These studies have explored coordination issues (A), network design principles (B), and vision problems (C). See 
text for complete details. 



works that we are aware of has attempted to integrate these 
studies, lessons learned from the octopus can potentially be 
used to improve these systems, while also helping to better 
understand how different layers of control collectively 
coordinate actions. 

Another example of a problem pertinent to biological and 
computational systems is distributed consensus (Simeone 
et al y 2008). Consensus is a building block in several 
computational procedures and is known to be an extremely 
hard problem for massive, asynchronous, and failure-prone 
systems (Lynch, 1996). In biology, consensus has an important 
role in coordinating populations. Fish, for example, are 
constantly under the threat of predation and must be able to 
quickly sense the environment and make decisions. The 
formation of groups can aid in collective vigilance because 
surveillance can be distributed among a set of fish. In addition, 
the group is likely to avoid individual errors by pooling 
sensory data together. However, the group also bears 
additional communication costs of propagating information 
to other members of the group, which delays decision making 
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(Ward et al, 2011). Ward et al (2011) have empirically shown 
that fish make faster and more accurate decisions under 
predation threat in large groups (shoals) compared with 
smaller groups. The self-organization of shoals, the authors 
argue, naturally emerges from quick reaction to local 
neighbors that spreads quickly through the shoal. Honeybees 
have also been shown to optimally minimize the decision time 
for a desired level of accuracy (under certain models) when 
deciding between candidate new homes (Marshall et al, 2009) . 
Here, honeybees also attempt to balance between accuracy 
(choosing the best home) and speed (time to make a decision) . 

Finally, synchronization is another form of coordination 
that many biological systems rely on, including pacemaker 
cells in the heart (Kuramoto and Yamagishi, 1990) and fireflies 
that flash in unison (Winfree, 1967; Mirollo and Strogatz, 
1990). By developing models and methods based on studying 
how fireflies synchronize (Mirollo and Strogatz, 1990), the 
theory of pulse-coupled oscillators has flourished and has been 
used in wireless sensor networks (Lucarelli and Wang, 2004; 
Hong and Scaglione, 2005; Werner-Allen et al, 2005; Pagliari 
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and Scaglione, 2011) and in overlay networks (Babaoglu et al, 
2007) for clock management purposes and to integrate time- 
sensitive data. Algorithms based on these systems have 
resulted in faster convergence times and improved our under- 
standing of biological self-stabilization (Daliot et al, 2003, 
2004). In all cases, as modeling efforts continue, it is possible 
that better distributed ensemble protocols can be developed, 
which may lead to new testable biological hypotheses. 



Networks 

Networks underlie many computational systems. Understand- 
ing the structure and dynamics of networks such as the 
Internet, social networks, and mobile networks has been the 
focus of several recent studies (Easley and Kleinberg, 2010). 
Key questions studied include how to best design such 
networks, how to determine what are the important nodes, 
how to route messages efficiently from one node to another, 
and how to maintain functionality despite errors or noise. 
These issues have also recently been the focus of several 
biological studies that have investigated how entities in 
molecular (signaling, regulatory, and metabolic) networks, 
cellular networks in the brain, and organism-level networks 
collectively interact and self-organize to efficiently accomplish 
tasks. We highlight a few recent examples below. 

Slime mold tunneling for network design 

When the slime mold Physarum polycephalum forages in an 
environment, it grows to become a tubular structure that links 
all the resources it finds together via direct connections or 
intermediate junctions (Figure 3B, left). Evolution has 
designed the network such that sources are connected via 
efficient paths, while also ensuring minimal production cost 
and robustness to disconnection. To better understand the 
design principles balancing these requirements, Tero et al 
(2010) created a map of 36 food sources ('cities') placed on a 
map of the Tokyo area. Remarkably, they found that the slime- 
mold designed network and the actual Tokyo rail system were 
very similar. In particular, both networks had comparable cost 
(rail length), efficiency (short paths), and resiliency (redun- 
dant paths) — all under various definitions and models of these 
factors. What is interesting, however, is that the slime mold 
designed its tunneling system in a completely distributive 
manner without any global coordination. Clearly, the same 
cannot be said of the rail system. 

Inspired by the characteristics of this process, Tero et al 
(2010) developed an algorithm for adaptive network construc- 
tion based on driving flow through the network and 
dynamically selecting for preferential tubes (Figure 3C, right). 
They show that their model can produce networks that closely 
resemble the actual networks grown by the slime mold. Such a 
model — which mathematically captures how the slime mold 
self-organizes, self-optimizes, and self-repairs itself (Marwan, 
2010) — can be used to quantify the trade-off between various 
desired network properties. These principles can be applied to 
designing mobile communication networks, in which various 
nodes or devices need to maintain communication pathways 
under a changing and dynamic environment. Indeed, follow- 
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up work has shown that a model inspired by the slime mold 
foraging strategy can be used to design path formation 
protocols in wireless sensor networks (Li et al, 2010) and 
optimize traffic distribution in transportation networks (Wa- 
tanabe et al, 2011) . On the other hand, mapping the slime mold 
tunneling process to a network design problem has also led to 
a greater understanding of how the slime mold balances 
conditions and constraints when foraging. 

Gene backup and fault tolerant systems 

Errors, failures, and attacks commonly occur in molecular 
processes. Examples includes DNA replication errors, protein 
failures due to misfolding or mutations affecting transcription, 
and pathogens that enter the cell and interfere with normal 
function. DNA repair, backup, and innate immunity are all 
cellular processes that have been remarkably designed to 
identify and overcome these issues. Gaining a better under- 
standing of how these systems operate is important for many 
health-related studies. 

Fault tolerance is also a critical design principle of 
computational systems. Consider a popular website backed 
by thousands of servers. At any time, one or more of these 
servers may fail due to error or malicious attack. Overcoming 
these issues and making them transparent to the user (by, e.g., 
redirecting queries to active servers that also act as backup 
mechanisms; Guerraoui and Schiper, 1996) is an important 
algorithm design problem. Improving these techniques is key 
in creating better and more usable distributed systems. 

So, how do molecular systems deal with failures? Recent work 
suggests that molecular systems do not in fact use dedicated 
failure detection mechanisms. Rather, they seamlessly integrate 
backup resources into their design. Using single gene-deletion 
mutants in S. cerevisiae, Gu et al (2003) found that knocking out 
duplicate genes (genes with paralogs) resulted in a higher 
probability of functional compensation (i.e., higher fitness) than 
when knocking out non-paralogous genes. 

Gitter et al (2009) reported a similar result in gene regulatory 
networks. These networks encode the relationships between 
transcription factors (TFs) and the genes they regulate. When 
knocking out TFs with a strong paralog, it was observed that the 
expression level of its regulated genes did not significantly 
change, due to compensation by the TF's paralog. On the other 
hand, when knocking out a singleton TF, there was a bigger 
change in expression of direct targets due to the lack of a backup 
mechanism. Balaji et al (2006) propose a fault tolerant 
computational model based on the design of regulatory net- 
works. To create a network that is fault tolerant to hub deletion 
they propose to take the original network and do a transforma- 
tion (multiplying A' A where A is adjacency matrix for the 
network) . In the resulting network, two nodes (or TFs) will be 
connected if they share many common neighbors in the network. 
This common-neighbors structure arises naturally in biological 
systems from the duplication-divergence model discussed below. 

Together, these studies show how backup is naturally built 
into decentralized molecular systems. Evolutionarily, it is 
possible that backup naturally emerged from the process of 
duplication and divergence. In this model, a gene is duplicated 
to yield a functional equivalent, which over time diverges to 
specialize in a similar subtask. Thus, the gene and its duplicate 
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share similar sequence and interaction partners and either can 
be used as a backup gene for the other. The challenge, from the 
design standpoint, is to determine which local nodes (genes) 
should be replicated in order to globally minimize the effect of 
errors. Recent work on robustness of networks to node and 
edge failures (Ciliberti et al, 2007; Zhong et al, 2009) may 
provide further information on how such ideas can be applied 
to computational systems. 

Additional network problems 

Effective routing of proteins to their designated subcellular 
locations is another problem that may benefit from joint 
computational and biological modeling. It is known that routing 
occurs via signals encoded in the amino-acid sequence of the 
protein (Shapiro et al, 2009) ; however, we are not aware of any 
model of distributed routing inspired by protein localization. 
Such a model, as with most other biological systems, might have 
several desired properties, such as efficiency and fault tolerance 
to specific conditions or environments. 

Another network that holds much promise for future work is 
the network of neurons that are active in the brain. As 
mentioned previously, neural networks represent a computa- 
tional optimization framework that is based on these net- 
works. However, with our increased ability to study brain 
activity (Sporns, 2010), we can start addressing other ques- 
tions about the design and architecture of real neural 
networks. For example, some models have been proposed to 
generate networks with brain-like connectivity (Kaiser and 
Hilgetag, 2004; Koene et al, 2009) while minimizing wiring 
and path lengths (Chen et al, 2006) . What local rules do these 
models employ and what do they tell us about why the brain is 
organized the way it is? 

Vision and tracking 

The visual system in the mammalian brain has an exceptional 
ability to process imagery and build a coherent representation 
of the world. It can seamlessly detect and track objects, 
reconstruct scenes, and restore images (Shapiro and Stockman, 
2001), all in real time and under various amounts of noise, 
lighting conditions, and transformations. No computer algo- 
rithm today comes close to matching its performance. Under- 
standing how this system works and replicating it in the 
computer has been a long sought goal of biology and computer 
scientists, respectively. By teaching a computer how to 'see', 
automated systems can be designed for face detection and 
authentication, 3D model building, medical image processing, 
tracking, and surveillance, among many more (Szeliski, 2010) . 

Bat echolocation and tracking 

Bats are notoriously well versed at tracking objects at night. 
Using their highly evolved sonar system, bats rely on echoes to 
forage and navigate efficiently at night. In biology, pursuit is a 
basic task to all prey-seeking animals. In computer science, 
localization is an important problem with many applications 
to defense and guidance (Figure 3C, left) systems. A common 
strategy for localization is to keep the stimulus of interest 
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constantly centered in the field of view. However, recent 
results from bats led to the discovery that this common 
perception is not optimal. 

Yovel et al (2010) studied how Egyptian fruit bats use 
echolocation to home-in on a target (Figure 3C, left). They 
measured the direction of the bats' sonar clicks in darkness and 
found that, instead of centering the sonar beam onto the target, 
they alternated the beam to the left and right of the target and 
thereby pointed the maximum slope of the beam onto the 
target. As a result, any subsequent motion of the object 
(relative to the bat) will maximally change the echo intensity 
and the sign of the change will indicate the direction of motion. 
They show that this strategy results in a less jittery path (i.e., a 
smoother trajectory) to the target that is easier and quicker to 
follow. In fact, the authors proved that this strategy is optimal 
with respect to localization, though it does pay some cost of 
possible misdetection (since less sound is emitted by not 
pointing the beam directly at the target; see Figure 3C, right). 

This study shows that even high-level biological processes 
can reveal interesting and computationally relevant models 
that can help clarify the logic of biological actions. 

The visual cortex and computer vision 

Visual processing in the brain has long been modeled by 
hierarchical models. By encoding objects at varying levels of 
sophistication, hierarchies are believed to be capable of 
recognizing objects under various scales, perspectives, and 
natural conditions. Different levels of the hierarchy respond to 
stimuli that become increasingly complex; neurons in the 
lower levels respond to simple bars and shapes, whereas 
neurons in the upper levels respond to more complex 
combinations and whole objects (Serre and Poggio, 2010). 
Recently, Riesenhuber and Poggio (1999) and others (Serre 
et al, 2007a, b) 'opened up' the brain to see how well such 
hierarchical models fared to actual neurophysiological data of 
neuron activation patterns in the inferotemporal cortex in the 
macaque brain. From these data, they proposed a new 
hierarchical feed-forward model for object recognition that 
adheres better to anatomical and physiological constraints and 
that better matched the corresponding neural recordings. The 
model is based on a novel way to pool or integrate responses 
from lower level neurons by identifying the strongest, most 
active signal present. This makes recognition robust to random 
noise and fluctuation. By representing features that are 
invariant to transformations, the model also illustrates how 
the visual system is able to recognize objects across different 
scales given only a single view at one scale. From the 
computational standpoint, the model has been shown to 
perform well for robust object recognition, action recognition, 
and face processing compared with many previous (non- 
biologically inspired) techniques (Jhuang et al, 2007; Serre 
et al, 2005, 2007a; Meyers and Wolf, 2008). 

Additional tracking and vision problems 

Research in computational neuroscience holds much promise 
for inspiring new models of visual processing (Serre and 
Poggio, 2010). While hierarchical feed-forward models were 
shown to closely replicate neural recordings for immediate 
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recognition (before eye movement and high-level processing 
occurs), feed-back projections are also extremely prevalent in 
the cortex and must be incorporated to model visual perception 
beyond immediate recognition (Serre and Poggio, 2010). Better 
understanding of how feed-forward and feed-back connections 
interface with one another will likely lead to better algorithms 
and better understanding of how more advanced processing 
(e.g., memory and learning) occurs in the brain. 

Another area where neuroscience may be helpful is in 
learning. Humans have a remarkable ability to recognize and 
detect objects very quickly (Thorpe et al, 1996) and using only 
a few training examples. Most computational classifiers today, 
however, typically require hundreds or even thousands of 
training examples to robustly learn discriminative features. It 
may be possible to develop additional one-shot machine 
learning algorithms (Li et al, 2006) by studying how visual 
processing circuits in the brain generalize concepts. There has 
also been work on drawing parallels between human and 
machine semisupervised learning (Zhu et al, 2007) . In return, 
these computational models may lead to better theories of 
human learning that quantify its limitations and biases. 

Engineering-based studies 

While our focus in this review is on algorithmic aspects of 
bidirectional studies, another area that has proliferated in 
recent years involves the engineering of various systems by 
tightly linking computational thinking and biology. We briefly 
mention a few examples here, though as mentioned above 
these deserve a separate review. One example of such an 
approach is recent studies in robotics in which researchers aim 
to mimic several aspects of living organisms for tasks that 
include flying (Franceschini et al, 2007), crawling (Hatton and 
Choset, 2010) and distributed control of arm movements 
(Laschi et al, 2009). These studies often involve careful 
analysis of how these tasks are performed by organisms 
ranging from birds to snakes to octopuses. In some cases, such 
an analysis can also lead to new insights about biological 
systems. For example, Franceschini et al (2007) recently 
developed a robotic helicopter that demonstrated how flying 
insects manage to avoid the ground by using an optic-flow 
regulator, which shed new light on unexplained findings 
regarding insects' visually guided performances. 

Synthetic biology and DNA computing are another biocom- 
putational area that has received tremendous interest recently. 
Here, the goal is either to engineer novel systems using 
biological components (synthetic biology) or to use molecules 



to perform computations (DNA computing). In many cases, 
these biological constructs utilize ideas that were previously 
used when designing computational systems. One of the first 
studies that demonstrated the computational power of biology 
was done by Adleman who showed how DNA can be 
manipulated to solve complex graph-theoretic optimization 
problems (Adleman, 1994; Paun et al, 2004). More recent 
studies have focused on the logical control of gene expression 
(Benenson et al, 2004), the construction of synthetic genomes 
and cells (Tian et al, 2004; Gibson et al, 2010), and the 
replication of neural systems using DNA computing (Qian 
etal, 2011). 

Discussion and conclusions 

The mutual relationship between biology and computer 
science goes back a long time. However, until recently, the 
application of computational methods to study biological 
processes and the influence of biological systems on the design 
of algorithms remained mostly separate. The former involved 
developing computational methods to solve specific biological 
problems, while the latter primarily viewed biological systems 
as motivation to derive new optimization methods. 

With the improvement in our ability to study detailed 
aspects of molecular, cellular, and organism-level biological 
systems, it is now possible to unite these two directions. This 
direction involves thinking about biological processes as 
algorithms that nature has designed to solve difficult real- 
world problems under a variety of constraints and conditions. 
By viewing these systems as information processing units, we 
can better understand their biological properties and their 
survival mechanisms, and can use the gained knowledge to 
design and improve computational systems. 

Over the last few years, this strategy has been successfully 
used in a number of different application areas including 
coordination, networks, and tracking and vision. The list of 
studies in Table II is not meant to be comprehensive but rather 
is meant to point out various systems that have been studied 
this way and additional systems that are ripe for analysis. 
Table III lists the shared principles that were identified as the 
basis for some of these studies. Such studies usually begin by 
identifying a problem addressed by both computational and 
biological systems. For example, methods to cluster networks 
have been widely studied in computer science (Schaeffer, 
2007; Fortunato, 2010); a number of biological processes, 
including lateral induction during intercell signaling (Char- 
alambous and Cui, 2008) and decision-making processes by 



Table III The five primary studies highlighted in this review (rows) each annotated with the principles it shares with computational systems (columns) 



Distributed Robust Networks Modular Stochastic Adaptive 



Fly SOP selection W W 

Slime mold tunneling i> 

Gene regulation W 

Bat localization x i> 

Brain processing i> i> 



w i> x 

)S x i> ]S 

w w x w 

X X X is* 

W W X W 



Most biological systems operate distributedly and seek a design that is robust and adaptive to changing environments. Networks often serve as a basis for carrying forth 
interactions and propagating information. These similarities provide a deep basis for the shared analysis of biological processes and computational algorithms. 



8 Molecular Systems Biology 2011 



© 2011 EMB0 and Macmillan Publishers Limited 



Convergence of biology and computer science 
S Navlakha and Z Bar-Joseph 



quorum-sensing bacteria (Wokoma et al, 2005; Ren and Meng, 
2006), perform a very similar task. The next step, before 
designing experiments, is to think about the biological 
system's computational requirements. For example, is the 
system distributed? What is known about how communication 
occurs within the system? What qualities (e.g., robustness and 
efficiency) may the system be trying to optimize? Does 
randomness or stochasticity have a role? Such questions can 
help pinpoint particular biological questions regarding the 
function of the process and may provide a unique point of view 
for even well-studied processes. Next, a model is often 
constructed based on the experimental results and this model 
is used to design an improved solution to the computational 
problem. In the clustering example, perhaps the model can 
lead to improved clustering algorithms that operate in noisy 
and distributive environments. 

In this review, we argued that by adopting a 'computational 
thinking' approach to studying biological processes, we can 
improve our understanding of biological processes and at the 
same time improve the design of algorithms. 
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